Comparing Ontology-Based and Corpus-Based Domain Annotations in WordNet
Abstract
Domain information is an emerging topic of interest in relation to WordNet. A lexical resource, WordNet Domains, is presented, in which WordNet synsets are annotated with domain labels such as Medicine, Architecture and Sport. This annotation reflects the lexico-semantic criteria adopted by the human annotators. From a corpus-based perspective, however, domains reflect term distribution in a given text collection. The paper proposes a preliminary investigation aimed at comparing and integrating ontology-based and corpus-based domain information.
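To make the contrast concrete, here is a minimal sketch in Python. It is not the paper's method. The labels, the toy corpus, and the helper names `corpus_domain` and `compare` are all hypothetical, invented for illustration. The corpus-based domain of a term is taken to be the collection in which the term has the highest relative frequency, which is only one simple reading of "term distribution in a given text collection".

```python
from collections import Counter

# Hypothetical hand-assigned labels, in the style of ontology-based annotation.
# These are NOT taken from the actual WordNet Domains resource.
ontology_labels = {
    "surgeon": "Medicine",
    "arch": "Architecture",
    "goal": "Sport",
    "door": "Furniture",
}

# Hypothetical toy corpus: tokenized text grouped by the domain of its collection.
corpus = {
    "Medicine": "the surgeon examined the arch of the foot".split(),
    "Architecture": "the arch supports the roof and the arch spans the door".split(),
    "Sport": "the striker scored a goal and the surgeon watched the goal".split(),
}


def corpus_domain(term, corpus):
    """Return the domain whose texts use `term` most often relative to their length.

    Returns None when the term does not occur in any collection.
    """
    rel_freq = {
        domain: Counter(tokens)[term] / len(tokens)
        for domain, tokens in corpus.items()
    }
    best = max(rel_freq, key=rel_freq.get)
    return best if rel_freq[best] > 0 else None


def compare(ontology_labels, corpus):
    """Pair each term's hand-assigned label with its corpus-derived domain."""
    return {
        term: (label, corpus_domain(term, corpus))
        for term, label in ontology_labels.items()
    }


if __name__ == "__main__":
    for term, (onto, corp) in compare(ontology_labels, corpus).items():
        status = "agree" if onto == corp else "differ"
        print(f"{term}: ontology={onto}, corpus={corp} ({status})")
```

In this toy setting the two views agree for "surgeon", "arch" and "goal" but differ for "door", which is the kind of mismatch a comparison of the two annotation sources would have to account for.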
Similar resources
Query Architecture Expansion in Web Using Fuzzy Multi Domain Ontology
As the web grows, establishing a general framework for mining and retrieving structured data from it poses many challenges. Creating an ontology is a step towards solving this problem. The ontology captures the main entities and concepts of the data being mined. In this paper, we try to propose a method for applying "meaning" in the search system, but the problem ...
Automated Alignment and Extraction of a Bilingual Ontology for Cross-Language Domain-Specific Applications
This paper presents a novel approach to ontology alignment and domain ontology extraction from two existing knowledge bases: WordNet and HowNet. These two knowledge bases are automatically aligned to construct a bilingual ontology based on the co-occurrence of words in a bilingual parallel corpus. The bilingual ontology achieves greater structural and semantic information coverage from these tw...
Corpus+WordNet thesaurus generation for ontology enriching
This paper presents a model to enrich an ontology with a thesaurus based on a domain corpus and WordNet. The model is applied to the data privacy domain and the initial domain resources comprise a data privacy ontology, a corpus of privacy laws, regulations and guidelines for projects. Based on these resources, a thesaurus is automatically generated. The thesaurus seeds are composed by the onto...
Automated Alignment and Extraction of Bilingual Domain Ontology for Cross-Language Domain-Specific Applications
In this paper we propose a novel approach to ontology alignment and domain ontology extraction from two existing knowledge bases, WordNet and HowNet. These two knowledge bases are aligned to construct a bilingual ontology based on the co-occurrence of the words in the sentence pairs of a parallel corpus. The bilingual ontology has the merit that it contains more structural and semantic informat...
Automatic Alignment and Extraction of Bilingual Domain Ontology for Medical Domain Web Search
This paper proposes an approach to automated ontology alignment and domain ontology extraction from two knowledge bases. First, WordNet and HowNet knowledge bases are aligned to construct a bilingual universal ontology based on the co-occurrence of the words in a parallel corpus. The bilingual universal ontology has the merit that it contains more structural and semantic information coverage fr...